Speech formant frequency estimation: evaluating a nonstationary analysis method

Authors

  • Preeti Rao
  • A. Das Barman
Abstract

The objective of this paper is to critically evaluate the performance of a nonstationary analysis method in tracking speech formant frequencies as they change with time due to the natural variations in the vocal-tract system during speech production. The method of instantaneous frequency estimation is applied to the tracking of speech formant frequencies to observe the time variations in the vocal-tract system characteristics within a pitch period. An implementation of an instantaneous frequency estimator based on the source-filter model of speech production is described for voiced speech formants. Based on experimental results from simulated as well as natural speech data, it is shown that the accuracy of the frequency estimates is heavily dependent on the nature of the glottal excitation waveform, the fundamental frequency and the frequency spacing of the formants in the speech signal. The effect of the choice of analysis parameters on the accuracy of the estimates is discussed. It is shown that the instantaneous frequency estimate accurately represents the actual formant frequency only when the formants are well separated and there are distinct regions of the glottal cycle in which the source excitation can be considered negligible. Experimental results on natural speech vowels, which show differences in formant frequencies in the different phases of the glottal cycle, are presented. © 2000 Elsevier Science B.V. All rights reserved.
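
The abstract's main finding (the estimate is reliable only for well-separated formants with negligible source excitation) can be illustrated with a small synthetic experiment. The sketch below is not the estimator described in the paper, whose details are not given on this page. It assumes a generic approach instead: isolate a band with a Hann-windowed complex exponential filter and read the frequency from the phase advance between successive output samples. The sampling rate, formant frequencies, bandwidths, filter length and all function names are illustrative assumptions.

```python
import cmath
import math

FS = 8000.0  # sampling rate in Hz (illustrative assumption)

def damped_resonance(freq_hz, bandwidth_hz, n_samples, amplitude=1.0):
    """Free decay of one resonance, i.e. a formant with no source excitation."""
    decay = math.pi * bandwidth_hz / FS
    omega = 2.0 * math.pi * freq_hz / FS
    return [amplitude * math.exp(-decay * n) * math.cos(omega * n)
            for n in range(n_samples)]

def complex_bandpass(x, center_hz, length=65):
    """Filter x with a Hann-windowed complex exponential centred on center_hz.

    Only fully overlapped outputs are returned, so start-up transients
    are excluded from the result.
    """
    omega_c = 2.0 * math.pi * center_hz / FS
    kernel = [(0.5 - 0.5 * math.cos(2.0 * math.pi * k / (length - 1)))
              * cmath.exp(1j * omega_c * k) for k in range(length)]
    return [sum(kernel[k] * x[n - k] for k in range(length))
            for n in range(length - 1, len(x))]

def instantaneous_frequency(z):
    """Frequency in Hz from the phase advance between successive samples."""
    return [FS * cmath.phase(z[n + 1] * z[n].conjugate()) / (2.0 * math.pi)
            for n in range(len(z) - 1)]

# Case 1: a single formant at 500 Hz; the filter is deliberately
# centred at 600 Hz to show the estimate follows the signal, not the filter.
single = damped_resonance(500.0, 50.0, 256)
if_single = instantaneous_frequency(complex_bandpass(single, 600.0))

# Case 2: a second, weaker resonance at 700 Hz falls in the same passband.
second = damped_resonance(700.0, 50.0, 256, amplitude=0.6)
mixed = [a + b for a, b in zip(single, second)]
if_mixed = instantaneous_frequency(complex_bandpass(mixed, 600.0))

mean_single = sum(if_single) / len(if_single)
spread_single = max(if_single) - min(if_single)
spread_mixed = max(if_mixed) - min(if_mixed)
print(f"single formant: mean {mean_single:.1f} Hz, spread {spread_single:.1f} Hz")
print(f"two formants:   spread {spread_mixed:.1f} Hz")
```

Under these assumptions the single-resonance estimate should stay close to 500 Hz throughout the decay, while the two-resonance estimate should swing by a few hundred hertz as the components beat against each other. This is the kind of degradation the abstract attributes to closely spaced formants. Adding an excitation term during the decay would be a natural next experiment, but it is left out here to keep the sketch short.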

Similar resources

Statistical Variation Analysis of Formant and Pitch Frequencies in Anger and Happiness Emotional Sentences in Farsi Language

Setting up an emotion recognition or emotional speech recognition system depends directly on how emotion changes the speech features. In this research, the influence of anger and happiness on speech features was evaluated and the results were compared with neutral speech. The pitch frequency and the first three formant frequencies were used for this purpose. The experimental results showed that there are lo...

A method for glottal formant frequency estimation

This study presents a method for estimating the glottal formant frequency (Fg) from speech signals. Our method is based on the zeros of z-transform decomposition of speech spectra into two spectra: a glottal-flow-dominated spectrum and a vocal-tract-dominated spectrum. Peak picking is performed on the amplitude spectrum of the glottal-flow-dominated part. The algorithm is tested on synthetic speech. It...

Tracking of involuntary formant frequency variations and application to parkinsonian speech

The objective of this paper is to present a formant frequency estimation method developed to track small variations due to involuntary vocal tract movement. The formant frequency estimation is based on the instantaneous frequencies obtained by means of a complex wavelet transform and is synchronised with the glottal cycle. Results for synthetic speech signals show the precision of ...

Hierarchical approach to formant detection and tracking through instantaneous frequency estimation - Electronics Letters

Formant frequencies, represented by major peaks in the spectrum of speech signals, convey important information about speech. The authors propose a method for detecting the formants of voiced speech through ‘instantaneous frequency’ (IF) estimation using a recursive least squares (RLS) algorithm. The accuracy of the technique is assessed by comparing it with conventional formant detection techni...

Formant model estimation and transformation for voice morphing

In this paper we consider the estimation and mapping of time-varying formant model parameters and orders for voice transformation. The model order is the number of perceptually significant formant trajectories estimated from an analysis of the poles of “over-modelled” linear prediction models of the source and target speech. A 2-D HMM with NF left-to-right states across frequency and M states a...

Journal:
  • Signal Processing

Volume 80  Issue

Pages  -

Publication date: 2000